Catching Super Massive Black Hole Binaries Without a Net 
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The gravitational wave signals from coalescing Supermassive Black Hole Binaries are prime targets 
for the Laser Interferometer Space Antenna (LISA). With optimal data processing techniques, the 
LISA observatory should be able to detect black hole mergers anywhere in the Universe. The 
challenge is to find ways to dig the signals out of a combination of instrument noise and the large 
foreground from stellar mass binaries in our own galaxy. The standard procedure of matched filtering 
against a grid of templates can be computationally prohibitive, especially when the black holes are 
spinning or the mass ratio is large. Here we develop an alternative approach based on Metropolis- 
Hastings sampling and simulated annealing that is orders of magnitude cheaper than a grid search. 
We demonstrate our approach on simulated LISA data streams that contain the signals from binary 
systems of Schwarzschild Black Holes, embedded in instrument noise and a foreground containing 26 
million galactic binaries. The search algorithm is able to accurately recover the 9 parameters that 
describe the black hole binary without first having to remove any of the bright foreground sources, 
even when the black hole system has low signal-to-noise. 



Supermassive Black Hole Binaries (SMBHBs) and Ex- 
treme Mass Ratio Inspirals (EMRIs) of a compact object 
into a supermassive black hole are two of the most excit- 
ing targets for the LISA observatory . Studies of these 
objects will yield insights into the role played by black 
holes in structure formation and galactic dynamics. The 
signals will also encode information about strong field, 
dynamical gravity that can be used to perform precision 
tests of general relativity 0, 0, • 

The SMBHB and EMRI signals contain a wealth of 
information that is encoded in a highly modulated time 
series composed of multiple harmonics of several distinct, 
evolving periods. The complexity of the signal is good 
news in terms of the science yield, but it poses a sig- 
nificant challenge to the data analyst. The signal from 
a binary system of structureless spinning objects, as de- 
scribed by general relativity and detected by LISA, is 
controlled by 17 parameters. In the case of SMBHBs the 
systems are expected to have circularized before entering 
the LISA band, thereby reducing the search to 15 param- 
eters. In the case of EMRIs the orbits are expected to 
maintain significant eccentricity in the LISA band, but 
the spin of the smaller body can be neglected, thereby 
reducing the search to 14 parameters. 

The large dimension of the search spaces and the high 
computational cost of generating the search templates 
make SMBHBs and EMRIs challenging targets for data 
analysis The problem only gets worse when one 

considers that we need to extract these signals from a 
timeseries that also contains the signals from millions of 
galactic binaries, and in the case of EMRIs, a possible 
self-confusion from hundreds of other EMRI systems [fl- 
it has been estimated that it would take 10 40 templates 
to perform an optimal grid search for EMRI signals 
The numbers are less for SMBHBs, but still out of reach 
computationally. Several alternative approaches have 
been discussed, including non-template based strategies 
that look for tracks in spectrograms|7j , and hierarchical, 
semi-coherent grid based searches 5]. Here we consider 
an alternative approach that uses Metropolis-Hastings 



sampling and simulated annealing to search through the 
space of templates. Our search method is closely related 
to the Markov Chain Monte Carlo (MCMC) HQ method 
that is used to explore the posterior distribution of the 
model parameters once the source has been located. In 
previous work the MCMC approach was used to test the 
Fisher matrix predictions for SMBHB parameter uncer- 
tainties by starting the chains off very close to true source 
parameters ^3- It was found that even the more 
sophisticated adaptive Reverse Jump MCMC algorithm 
performed poorly when searching large regions of param- 
eter space. We have found the non-Markovian sampling 
employed by our algorithm to be many orders of magni- 
tude faster than the MCMC search algorithms that have 
been investigated to-date. Advanced MCMC techniques 
that employ importance resampling and well designed 
priors have been used to study 5-parameter binary inspi- 
ral signals in the context of ground based gravitational 
wave detectors 12]. It would be very interesting to see 
how this algorithm performs in the LISA context. We ap- 
ply our search algorithm to simulated LISA data streams 
that include the signals from a pair of non-rotating black 
holes and a foreground produced by galactic white dwarf 
binaries. While the SMBHB system we consider is sim- 
pler than the general case (the model is described by 9 
parameters rather than 15), it serves to illustrate the rel- 
ative economy of the gridless approach. 

The gravitational waveform for a supermassive black 
hole system consisting of two Schwarzschild black holes 
is described by 9 parameters: the redshifted chirp mass, 
M c ; the redshifted reduced-mass, /u; the sky location, 
(6,4>); the time-to-coalescence, t c ; the inclination of the 
orbit of the binary, i\ the phase of the wave at coales- 
cence, if c ; the luminosity distance, Dr,; and the polariza- 
tion angle, ip. The parameters M c ,[i are intrinsic to the 
system, while Dl, l, ip c , ip are extrinsic as they depend on 
the perspective of the observer. The other three param- 
eters, (0, tfi) and t c , would be extrinsic if the LISA obser- 
vatory were static, but the motion of the detector couples 
these parameters to the intrinsic evolution. The extrinsic 
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parameters Dl, L,tp c ,ip can be analytically solved for us- 
ing a generalized F-statistic 0] , leaving a 5 dimensional 
search space. 

We illustrate the performance of the search algo- 
rithm by considering two representative LISA sources 
- a 10 6 — 10 5 Mq binary system at z = 1 and a 
10 5 — 5 x 10 4 Mq binary system at z = 5.5. In each 
case the time of observation is 6 months, and the 
observations end <~ 1 week prior to merger. The early 
termination of the signal is designed to demonstrate 
LISA's ability to give early warning to other tele- 
scope facilities. The 2 = 1 example has parameters 
(M C /M Q , /i/Mq, Z?i/Gpc, i c /months, 9, </>, t, tp c , V) = 
(4.93xl0 5 ,1.82xl0 5 , 6.6, 6.23, 1.325,2.04, 1.02,0.95,0.66) 
and the z — 5.5 example has parameters 
(M c /Mq, fi/M e ,D L /Gpc, i c /months, 9, <j), i, <p c , </0 
(3.95xl0 5 , 2.17x 10 5 , 53, 6.25, 1.927, 0.351, 1.318, 2.0, 0.23) 
To make the searches more realistic, we add in a galac- 
tic foreground consisting of approximately 26 million 
galactic sources. The galactic binary foreground is gen- 
erated using a Nelemans, Yungelson and Zwart galaxy 
model [lH fl5| . The signal-to-noise ratio (SNR) for the 
sources is estimated using the combined instrument 
and galactic confusion noise. The z = 1 example has 
SNR = 118.0 and the z = 5.5 example has SNR = 9.87. 
These SNR ratios are on the low side for typical LISA 
observations of SMBHBs as we terminate the observa- 
tions a week before merger. The full inspiral signals 
would give SNRs of ~ 387 and ~ 182 for the two cases, 
and the merger and ringdown signals would further 
boost the SNRs by a factor of ~ 2 or more. In Fig ^ we 
plot the detector response to the galactic foreground and 
instrument noise, along with the noise-free response to 
the SMBHB signals. We use restricted post-Newtonian 
waveforms with 2-PN evolution of the phase and we 
employ the two independent interferometry channels 
that are available at low frequencies |16j . As the 
equations that describe the phase evolution break down 
before we reach the last stable circular orbit at R = 6M, 
we terminate the search templates at a maximum value 
of R = 7M. For the sources in question, the observation 
period terminates a week from coalescence, so the 
maximum gravitational wave frequency reached is 0.28 
mHz for the 2 = 1 example and 0.32 mHz for the z = 5.5 
example. With this frequency range, the SMBHBs 
overlap with over 22.5 million galactic binaries. To 
minimize the computational cost, the search templates 
were generated at a sample cadence of 4.2 mHz. 

Our search algorithm uses Metropolis-Hastings rejec- 
tion sampling, simulated annealing and algebraic ex- 
tremization over extrinsic and quasi-extrinsic parame- 
ters. The sampling proceeds as follows: Choose a random 
starting point x in parameter space. Using a proposal 
distribution q(-\x), draw a new point y. Evaluate the 
Hastings ratio 

H = Tr(y)p(s\y)q{x\y) 
TT(x)p(s\x)q{y\x) ' 
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FIG. 1: The strain spectral density in a single LISA channel. 
The grey line is the LISA response to a galactic background 
of 26 million sources plus simulated instrumental noise. The 
solid black lines show the LISA response to the SMBHB sig- 
nals alone. The dashed black line indicates the RMS instru- 
ment plus galactic confusion noise level. 

Accept the candidate point y with probability a = 
min(l, H), otherwise remain at the current state x. Here 
7r(x) are the priors on the parameters, 

p(s\x) = const. e -<«-fc(*0l«-*(i0>/3 j ( 2 ) 

is the likelihood and q(x\y) is the proposal distribution. 
The angular brackets (s— h(x)\s — h(x)) denote the stan- 
dard noise weighted inner product of the signal s minus 
the template h(x). We employ three different proposal 
distributions that are designed to give small, medium 
and large jumps. This mixture of jump sizes gives the 
search the flexibility to fully explore the parameter space 
and the ability to quickly hone in on promising regions. 
The small jumps are drawn from a multi-variate Normal 
distribution, the medium sized jumps are given by a uni- 
form draw of ± 10cr in each parameter and the large jumps 
come from a full range, uniform draw on all the param- 
eters. We used a mixture of 20 small jump proposals for 
every medium or large jump proposal. Correlations be- 
tween the parameters can seriously hurt the acceptance 
rate, so we use a multi-variate Normal distribution that 
is the product of Normal distributions in each eigendi- 
rection of the Fisher information matrix, Tij{x). The 
standard deviation in each eigendirection is set equal to 
o", = l/y/DEi, where D = 9 and Ej is the corresponding 
eigenvalue of the Fisher matrix |l9j . The Fisher matrix 
is also used to scale the medium size jumps. 

The simulated annealing is done by multiplying the 
noise weighted inner product (s\h) by an inverse temper- 
ature p. We used a standard power-law cooling schedule: 

f 1Q B(l-i/N c ) Q<i< Nc 

P={ , (3) 

I 1 i > N c 
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where i is the number of steps in the chain and N c is the 
number of steps the chain takes to reach the normal tem- 
perature. We found that an initial heat factor of between 
10 to 100 and a cooling schedule that lasted for ~ 10 4 
steps worked well, but the performance was not partic- 
ularly sensitive to these choices. For low SNR sources 
smaller initial heat factors and slightly longer cooling 
schedules yielded better results. 

The F-statistic is used to automatically extremize over 
the four parameters (Dl, t, ip, fc), but the motion of the 
LISA detector sets a time reference, so the usual trick of 
using a fast Fourier transform to extremize over the time 
to coalescence, t c , is not strictly permitted. However, 
the waveforms are much less sensitive to the sky location 
than they are to t c , so we employed t c maximization dur- 
ing the annealing phase for the large and medium jump 
proposals. This procedure biases the solution, but the 
bias is erased by subsequent jumps. 

In dozens of tests applied to many different examples, 
our search algorithm never failed to detect the SMBHB 
signals. On occasions the chain would lock onto a sec- 
ondary maxima of the likelihood function, but this be- 
haviour can be heavily suppressed by using longer cooling 
schedules. Once the annealing phase is complete the t c 
maximization is turned off and our search algorithm be- 
comes a standard Markov Chain Monte Carlo (MCMC) 
algorithm for exploring the posterior distribution func- 
tion. The MCMC method is a multi-purpose approach 
that can be used to perform model comparisons, esti- 
mate instrument noise, and provide error estimates for 
the recovered parameters The method is now in 

widespread use in many fields, and is starting to be used 
by astronomers and cosmologists. MCMC techniques 
have been applied to ground based gravitational wave 
data analysis |17| ; a toy LISA problem 0] ! an d the ex- 
traction of multiple overlapping galactic binaries from 
simulated LISA data [Tsj . 

For the example at z — 1 we use the following uniform 
priors in our search: we choose the mass ratio to lie be- 
tween 5 and 15, the redshifted total mass between 5 x 10 5 
and 5 x 10 6 solar masses, t c is chosen to lie within 3 and 
9 months, and 9 and cj> are drawn from a uniform sky 
distribution. The initial heat was set at 100 (B = —2) 
and the annealing lasted for N c = 10, 000 steps. The 
search took three hours to run on a single 2 GHz pro- 
cessor. In Fig. |21 we plot a representative search chain. 
Because the search algorithm locks onto the source in 
N ~ 1000 steps, we use a logarithmic scale for the num- 
ber of iterations, N. The t c maximization allows the 
search to hone in on M c and t c very quickly. The re- 
duced mass /i is less well constrained and takes a little 
longer to lock in, and the sky location gets fixed last of 
all. The extrinsic parameters Dl,l are recovered once 
the sky location is determined, while "0 and tp c continue 
to explore their full range throughout the evolution. The 
failure to fix ip and ip c is consistent with the Fisher ma- 
trix predictions for the uncertainties in these parame- 
ters. The errors in the recovered search parameters were: 
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FIG. 2: A plot of the search chains for the five intrinsic pa- 
rameters and the SNR. In all cases, the straight solid line 
represents the true values of the SMBHB parameters. After 
N = N c = 10000 steps the search becomes a standard MCMC 
exploration of the posterior distribution function. 



AM C = -42M = -0.154c; A/x = 729M Q = 0.147c; 
At c = 76 s = -0.171cr; A9 = 0.82° = 1.06cr; and 
Acf> = 0.65° = —1.28c; where the standard deviations 
were determined from the MCMC portion of the chains. 

We found that our gridless search algorithm is able to 
reliably identify the SMBHB signal within ~ 1, 000 steps. 
We have calculated that it would take 9.3x 10 12 templates 
to cover the same search range with an F-statistic based 
grid search at a minimal match level of 0.9 20]. The 
comparison is not entirely fair since we also used an illegal 
maximization over t c during the annealing phase, but we 
have verified that the annealed chains are able to find 
the SMBHB signal without this trick, it just takes 10 
to 100 times longer. Either way, our search algorithm 
is significantly more economical than a naive grid based 
search. 

For the example at z — 5.5 we use the following uni- 
form priors in our search: we choose the mass ratio to 
lie between 1 and 5, the redshifted total mass between 
2 x 10 5 and 2 x 10 6 solar masses, t c is chosen to lie within 
5 and 7 months and 9 and <fi are drawn from a uniform 
sky distribution. The initial heat was set at 10 (B = —1) 
and the annealing phase lasted for N c = 20, 000 steps. In 
Fig. |21 we plot the SNR evolution for three runs. Two of 
these runs happened to lock onto an alternative solution 
for the sky location that exist because of the approximate 
symmetry cf) — ► <p + it and 9 — ► it — 9 that holds for the 
low frequency LISA response function. Since the two so- 
lutions for the sky position have almost equal likelihood, 
the bimodality of the solution is a feature, rather than a 
flaw, of the search algorithm. As might be expected, the 
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FIG. 3: Examples of the SNR evolution for three runs search- 
ing for the low SNR signal at redshift z = 5.5. The chains 
typically found, then lost the signal on several occasions 
early in the runs due to the high temperature and low SNR. 
The chains usually locked on for good at around iteration 
N ~ 12000. 



search algorithm takes longer to lock onto weak sources 
than strong sources, but the run times are still measured 
in hours, not days. 



Here we have shown that it is possible to dig a SMBHB 
signal out from under instrument noise and the signals 
from foreground sources. The errors in the recovered pa- 
rameters are consistent with a Fisher matrix prediction 
that treats the galactic foreground as an addition source 
of Gaussian noise. We will present a detailed study of 
detection threshold and the posterior distributions in the 
presence of galactic foregrounds in a future publication. 
The next step is to apply the same techniques to the 
more complicated signals from spinning SMBHB 's and 
EMRIs. The larger parameter spaces are not expected 
to pose a problem as the search cost is expected to scale 
linearly with the search dimension. Indeed, it should be 
possible to simultaneously search for multiple, overlap- 
ping EMRI signals. We consider our current work as a 
proof-of-principle that the LISA data analysis challenge 
can be addressed with modest computational resources. 
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